189 research outputs found

    Model-based clustering with data correction for removing artifacts in gene expression data

    The NIH Library of Integrated Network-based Cellular Signatures (LINCS) contains gene expression data from over a million experiments, using Luminex Bead technology. Only 500 colors are used to measure the expression levels of the 1,000 landmark genes, and the data for the resulting pairs of genes are deconvolved. The raw data are sometimes inadequate for reliable deconvolution, leading to artifacts in the final processed data. These include the expression levels of paired genes being flipped or given the same value, and clusters of values that are not at the true expression level. We propose a new method called model-based clustering with data correction (MCDC) that is able to identify and correct these three kinds of artifacts simultaneously. We show that MCDC improves the resulting gene expression data in terms of agreement with external baselines, as well as improving results from subsequent analysis. Comment: 28 pages

    A Posterior Probability Approach for Gene Regulatory Network Inference in Genetic Perturbation Data

    Inferring gene regulatory networks is an important problem in systems biology. However, these networks can be hard to infer from experimental data because of the inherent variability in biological data as well as the large number of genes involved. We propose a fast, simple method for inferring regulatory relationships between genes from knockdown experiments in the NIH LINCS dataset by calculating posterior probabilities, incorporating prior information. We show that the method is able to find previously identified edges from TRANSFAC and JASPAR and discuss the merits and limitations of this approach.

    Multiclass classification of microarray data with repeated measurements: application to cancer

    Prediction of the diagnostic category of a tissue sample from its gene-expression profile and selection of relevant genes for class prediction have important applications in cancer research. We have developed the uncorrelated shrunken centroid (USC) and error-weighted, uncorrelated shrunken centroid (EWUSC) algorithms that are applicable to microarray data with any number of classes. We show that removing highly correlated genes typically improves classification results using a small set of genes.

    Clustering gene-expression data with repeated measurements

    Clustering is a common methodology for the analysis of array data, and many research laboratories are generating array data with repeated measurements. We evaluated several clustering algorithms that incorporate repeated measurements, and show that algorithms that take advantage of repeated measurements yield more accurate and more stable clusters. In particular, we show that the infinite mixture model-based approach with a built-in error model produces superior results.

    From co-expression to co-regulation: how many microarray experiments do we need?

    BACKGROUND: Cluster analysis is often used to infer regulatory modules or biological function by associating unknown genes with other genes that have similar expression patterns and known regulatory elements or functions. However, clustering results may not have any biological relevance. RESULTS: We applied various clustering algorithms to microarray datasets with different sizes, and we evaluated the clustering results by determining the fraction of gene pairs from the same clusters that share at least one known common transcription factor. We used both yeast transcription factor databases (SCPD, YPD) and chromatin immunoprecipitation (ChIP) data to evaluate our clustering results. We showed that the ability to identify co-regulated genes from clustering results is strongly dependent on the number of microarray experiments used in cluster analysis and the accuracy of these associations plateaus at between 50 and 100 experiments on yeast data. Moreover, the model-based clustering algorithm MCLUST consistently outperforms more traditional methods in accurately assigning co-regulated genes to the same clusters on standardized data. CONCLUSIONS: Our results are consistent with respect to independent evaluation criteria that strengthen our confidence in our results. However, when one compares ChIP data to YPD, the false-negative rate is approximately 80% using the recommended p-value of 0.001. In addition, we showed that even with large numbers of experiments, the false-positive rate may exceed the true-positive rate. In particular, even when all experiments are included, the best results produce clusters with only a 28% true-positive rate using known gene transcription factor interactions
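The evaluation criterion described in this abstract (the fraction of same-cluster gene pairs that share at least one known transcription factor) can be illustrated with a minimal sketch. The function name, the dictionary-based input format, and the toy gene and factor labels below are assumptions made for illustration only; they are not taken from the paper, which may also restrict the pairs it counts in ways the abstract does not state.

```python
from itertools import combinations


def coregulated_pair_fraction(cluster_of, tfs_of):
    """Fraction of same-cluster gene pairs sharing at least one known TF.

    cluster_of: dict mapping gene -> cluster label (hypothetical format)
    tfs_of:     dict mapping gene -> set of known transcription factors
                (hypothetical format; missing genes count as having none)
    """
    same_cluster_pairs = 0
    pairs_sharing_tf = 0
    for g1, g2 in combinations(sorted(cluster_of), 2):
        if cluster_of[g1] != cluster_of[g2]:
            continue  # only pairs assigned to the same cluster are scored
        same_cluster_pairs += 1
        # A non-empty set intersection means at least one common TF.
        if tfs_of.get(g1, set()) & tfs_of.get(g2, set()):
            pairs_sharing_tf += 1
    if same_cluster_pairs == 0:
        return 0.0
    return pairs_sharing_tf / same_cluster_pairs


# Toy example with made-up labels (not real annotations).
clusters = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1}
tfs = {
    "g1": {"TF_A"},
    "g2": {"TF_A", "TF_B"},
    "g3": {"TF_C"},
    "g4": {"TF_B"},
    "g5": {"TF_B"},
}
fraction = coregulated_pair_fraction(clusters, tfs)
print(fraction)
```

In the toy data, four gene pairs fall in the same cluster and two of them share a factor, so the score is one half. Comparing this score across clusterings built from different numbers of experiments is the kind of comparison the abstract reports.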

    Iterative Bayesian Model Averaging: a method for the application of survival analysis to high-dimensional microarray data

    Background: Microarray technology is increasingly used to identify potential biomarkers for cancer prognostics and diagnostics. Previously, we have developed the iterative Bayesian Model Averaging (BMA) algorithm for use in classification. Here, we extend the iterative BMA algorithm for application to survival analysis on high-dimensional microarray data. The main goal in applying survival analysis to microarray data is to determine a highly predictive model of patients' time to event (such as death, relapse, or metastasis) using a small number of selected genes. Our multivariate procedure combines the effectiveness of multiple contending models by calculating the weighted average of their posterior probability distributions. Our results demonstrate that our iterative BMA algorithm for survival analysis achieves high prediction accuracy while consistently selecting a small and cost-effective number of predictor genes. Results: We applied the iterative BMA algorithm to two cancer datasets: breast cancer and diffuse large B-cell lymphoma (DLBCL) data. On the breast cancer data, the algorithm selected a total of 15 predictor genes across 84 contending models from the training data. The maximum likelihood estimates of the selected genes and the posterior probabilities of the selected models from the training data were used to divide patients in the test (or validation) dataset into high- and low-risk categories. Using the genes and models determined from the training data, we assigned patients from the test data into highly distinct risk groups (as indicated by a p-value of 7.26e-05 from the log-rank test). Moreover, we achieved comparable results using only the 5 top selected genes with 100% posterior probabilities. On the DLBCL data, our iterative BMA procedure selected a total of 25 genes across 3 contending models from the training data. Once again, we assigned the patients in the validation set to significantly distinct risk groups (p-value = 0.00139). Conclusion: The strength of the iterative BMA algorithm for survival analysis lies in its ability to account for model uncertainty. The results from this study demonstrate that our procedure selects a small number of genes while eclipsing other methods in predictive performance, making it a highly accurate and cost-effective prognostic tool in the clinical setting.
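The model-averaging step described in this abstract can be sketched as a posterior-weighted sum of per-model linear predictors, followed by a split into risk groups. This is only a rough illustration under stated assumptions: the function names, the input layout, the toy coefficients, and the cutoff value are all hypothetical, and the abstract does not say how the authors chose the threshold between high- and low-risk patients.

```python
def bma_risk_score(expression, models):
    """Posterior-weighted average of per-model linear predictors.

    expression: dict gene -> expression value for one patient
    models:     list of (posterior_probability, {gene: coefficient}) pairs,
                as estimated on training data (hypothetical layout)
    """
    score = 0.0
    for posterior_prob, coefficients in models:
        # Linear predictor of a single contending model.
        linear_predictor = sum(
            beta * expression[gene] for gene, beta in coefficients.items()
        )
        # Each model contributes in proportion to its posterior probability.
        score += posterior_prob * linear_predictor
    return score


def risk_group(score, cutoff):
    """Assign a risk label; the cutoff is an assumed, user-supplied value."""
    return "high" if score > cutoff else "low"


# Toy numbers for illustration only (not from the paper).
models = [
    (0.6, {"geneA": 1.0, "geneB": -0.5}),
    (0.4, {"geneA": 0.5}),
]
patient = {"geneA": 2.0, "geneB": 1.0}
score = bma_risk_score(patient, models)
group = risk_group(score, cutoff=1.0)
print(score, group)
```

Weighting by posterior model probability is what lets the averaged score reflect model uncertainty rather than committing to a single selected model.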

    Is acupuncture effective in controlling gagging when taking an alginate impressions?

    Our community health project aimed to (1) identify the prevalence of gagging among patients attending the Prince Philip Dental Hospital and socio-demographic variations in reported gagging experiences; and (2) perform a pilot study to evaluate the effectiveness of acupuncture in the control of gagging in the dental setting. Methods: A survey on reported gagging experiences was conducted among a convenience sample of 225 patients attending our hospital. Participants who reported previous gagging in the dental setting were invited to participate in a pilot study to evaluate the effectiveness of acupuncture in controlling gagging when taking an upper alginate impression. Participants were randomized to receive acupuncture stimulation on their first visit either at a site on the lower lip reported to be effective in the control of gagging (point CV 24) or at a sham site on the upper lip (point GV 26), and to receive the alternative stimulation at their second visit. Results: The response rate to the survey was 81.3% (183/225). Approximately a third (58/183) reported having experienced gagging in the dental setting, most frequently when having a dental impression taken (approximately a quarter of participants, 44/183). Half (95/183) reported gagging while performing oral self-care. Four in ten participants (73/183) reported some stress visiting the dentist related to gagging. Socio-demographic variations in reported gagging experiences were evident with respect to age, gender and education level. The response rate to the pilot study was 92.3% (36/39). There was no significant difference in the prevalence of gagging when acupuncture was applied to the test site compared to when acupuncture was applied to the sham site on dental examination (p>0.05) or when taking an upper alginate impression (p>0.05). Conclusions: Gagging is a relatively common experience reported by patients attending our hospital: in daily life, in the dental setting and in performing oral self-care. Socio-demographic variations in the prevalence of gagging were apparent. The pilot study does not support the use of acupuncture in controlling gagging in the dental setting.

    Incidence of Deep Vein Thrombosis in Hospitalized Chinese Medical Patients and the Impact of DVT Prophylaxis

    Objective. To evaluate the incidence of deep vein thrombosis (DVT) in hospitalized Chinese medical patients and the impact of DVT prophylaxis. Methods. All cases of confirmed proximal DVT from 1 January 2005 to 31 December 2008 were reviewed retrospectively to determine the presence of risk factors and whether DVT developed during hospitalization in medical wards or led to readmission with a diagnosis of DVT within 14 days of discharge from a recent admission to medical wards. The impact of prophylaxis was estimated by comparing the annual incidence of proximal DVT among medical patients hospitalized from 2005 to 2007 with that of 2008 (when DVT prophylaxis was commonly used). Results. From 1 January 2005 to 31 December 2008, 3938 Doppler ultrasound studies were performed for suspected DVT. Proximal DVT was diagnosed in 687 patients. The calculated incidence of proximal DVT among medical patients hospitalized for at least two days was 1.8%, 2%, and 1.7% for the years 2005, 2006, and 2007, respectively. The incidence was 1.1% for 2008 (P < .001). Conclusion. The incidence of proximal DVT was substantial in Chinese medical patients, and DVT prophylaxis might reduce such risk.